Efficient Query Processing on Modern Hardware

Author

  • Thomas Neumann
Abstract

Most database systems translate a given query into an expression in a (physical) algebra, and then evaluate this algebraic expression to produce the query result. The traditional way to execute these algebraic plans is the iterator model: every physical algebraic operator conceptually produces a tuple stream from its input and allows for iterating over this tuple stream. This is a simple interface that allows for easy combination of arbitrary operators, but it clearly comes from a time when query processing was dominated by I/O and CPU consumption was less important: the iterator interface causes thousands of expensive function calls, degrades the branch prediction of modern CPUs, and often results in poor code locality and complex book-keeping. On modern hardware, query processing can be improved considerably by processing tuples in a data-centric, rather than an operator-centric, way. Data is processed such that it can be kept in CPU registers as long as possible, and operator boundaries are blurred to achieve this goal. In combination with a code compilation framework, this results in query code that rivals the speed of hand-written code. Using these techniques in the HyPer DBMS, TPC-H Query 1, for example, can be aggregated single-threaded over the scale factor 1 (1 GB) data set in about 68 ms on commodity hardware.
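The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not HyPer's code and says nothing about the performance figures above. The toy relation `ROWS`, the operator classes, and both plan functions are hypothetical names chosen for illustration. Python does not expose CPU registers, so the sketch only shows the structural difference: the iterator plan pulls each tuple through a chain of `next()` calls, while the data-centric plan fuses scan, selection, and aggregation into a single loop in which the current tuple stays in local variables.

```python
# Hypothetical toy relation of (quantity, price) tuples; not from the paper.
ROWS = [(5, 10.0), (30, 2.5), (12, 4.0), (50, 1.0)]


class Scan:
    """Iterator-model scan: returns one tuple per next() call, None at the end."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Select:
    """Iterator-model selection: pulls from its child until a tuple qualifies."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class SumAggregate:
    """Iterator-model aggregation: drains its child and sums an expression."""

    def __init__(self, child, expr):
        self.child = child
        self.expr = expr

    def result(self):
        total = 0.0
        row = self.child.next()
        while row is not None:
            total += self.expr(row)
            row = self.child.next()
        return total


def iterator_plan(rows, max_qty):
    """Operator-centric evaluation: every tuple crosses several call boundaries."""
    plan = SumAggregate(
        Select(Scan(rows), lambda r: r[0] <= max_qty),
        lambda r: r[0] * r[1],
    )
    return plan.result()


def data_centric_plan(rows, max_qty):
    """Data-centric evaluation: operator boundaries are fused into one loop."""
    total = 0.0
    for qty, price in rows:       # scan
        if qty <= max_qty:        # selection
            total += qty * price  # aggregation
    return total


if __name__ == "__main__":
    print(iterator_plan(ROWS, 24), data_centric_plan(ROWS, 24))
```

Both functions compute the same aggregate. In a compiling engine, the fused loop is roughly the shape of code that would be generated per query instead of being written by hand.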


Similar resources

Query Processing and Optimization in Modern Database Systems

Relational database management systems, which were designed decades ago, are still the dominant data processing platform. Large DRAM capacities and servers with many cores have fundamentally changed the hardware landscape. Traditional database systems were designed with very different hardware in mind and cannot exploit modern hardware effectively. This thesis focuses on the challenges posed by...


A Cost Model for Data Stream Processing on Modern Hardware

For stream processing applications, which use queries to process or analyze data arriving from potentially endless streams, low latency and high throughput are key requirements. These are not easy to achieve, as many factors influence the actual runtime of query execution plans and one cannot measure all of them individually. Therefore, query optimizers try to overcome this hurdle by using ...


FPGA Implementation of JPEG and JPEG2000-Based Dynamic Partial Reconfiguration on SOC for Remote Sensing Satellite On-Board Processing

This paper presents the design procedure and implementation results of a proposed hardware design that performs different satellite image compressions using a Xilinx FPGA board. First, the method is described; then VHDL code is written and synthesized with the Xilinx ISE software. The results show that it is easy and useful to design, develop and implement the hardware image compressor using ne...


Multi-level Parallel Query Execution Framework for CPU and GPU

Recent developments have shown that classic database query execution techniques, such as the iterator model, are no longer optimal for leveraging the features of modern hardware architectures. This is especially true for massively parallel architectures, such as many-core processors and GPUs. Here, the processing of single tuples in one step is not enough work to utilize the hardware resources and t...


Efficiently Compiling Efficient Query Plans for Modern Hardware

As main memory grows, query performance is more and more determined by the raw CPU costs of query processing itself. The classical iterator style query processing technique is very simple and flexible, but shows poor performance on modern CPUs due to lack of locality and frequent instruction mispredictions. Several techniques like batch oriented processing or vectorized tuple processing have be...




Publication date: 2011